Machine Speech Chain
Similar resources
Machine Speech Chain with One-shot Speaker Adaptation
In previous work, we developed a closed-loop speech chain model based on deep learning, in which the architecture enabled the automatic speech recognition (ASR) and text-to-speech synthesis (TTS) components to mutually improve their performance. This was accomplished by the two parts teaching each other using both labeled and unlabeled data. This approach could significantly improve model perfo...
Man-Machine Interaction Using Speech
ed acoustic descriptions to an acoustically matching response. This is where the requirement for information at a higher level than the acoustic level arises, for out of all the pattern groupings that could be learned, only those that are meaningful are learned. Sutherland [145] argues the points involved in such a model of perception very cogently, for the visual case. We do not, at present, k...
Field Testing the Tongues Speech-to-Speech Machine Translation System
The Tongues portable, rapid-development, speech-to-speech machine translation system was developed specifically to allow a realistic field-test of a deployable prototype. In this paper we will describe the system, its field-testing using regular US Army officers and naive Croatians, and the evaluation of these tests. The evaluation includes analysis of answers to a questionnaire, analysis of sy...
Statistical Natural Language Generation for Speech-to-Speech Machine Translation
This paper presents a statistical natural language generation scheme for trainable speech-to-speech machine translation (MT) systems for limited domain applications using a cascaded approach. The natural language generation scheme in the translation systems is based on a maximum entropy (ME) statistical model fully trained from a corpus, allowing flexible translation outputs. In this paper, the...
Enriching machine-mediated speech-to-speech translation using contextual information
Conventional approaches to speech-to-speech (S2S) translation typically ignore key contextual information such as prosody, emphasis, and discourse state in the translation process. Capturing and exploiting such contextual information is especially important in machine-mediated S2S translation as it can serve as a complementary knowledge source that can potentially aid the end users in improved unde...
Journal
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Year: 2020
ISSN: 2329-9290, 2329-9304
DOI: 10.1109/taslp.2020.2977776